Word-Error Correction of Continuous Speech Recognition Based on Normalized Relevance Distance
Abstract
Despite recent advances in speech recognition, recognition errors remain unavoidable in continuous speech recognition. In this paper, we focus on a word-error correction system for continuous speech recognition using confusion networks. Conventional N-gram correction is widely used; however, its performance degrades because the N-gram approach cannot measure information between words that are far apart. To improve the performance of the N-gram model, we employ Normalized Relevance Distance (NRD) as a measure of semantic similarity between words. NRD captures not only co-occurrence but also the correlation of the importance of the terms in documents. Even if the words are located far from each other, NRD can estimate the semantic similarity between them. The effectiveness of our method was evaluated in continuous speech recognition tasks for multiple test speakers. Experimental results show that our error-correction method is the most effective of the compared approaches, outperforming methods that use other features.
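To make the contrast with N-gram features concrete, the following is a minimal sketch of an NRD-style distance. It assumes the commonly cited form of NRD, in which the document counts of Normalized Google Distance are replaced by sums of per-term tf-idf weights scaled to [0, 1]; the abstract does not state the formula, so the definition used in the paper may differ. The function names, the idf smoothing, and the toy corpus are all illustrative assumptions.

```python
import math
from collections import Counter


def relevance_weights(docs):
    """Map each term to its per-document tf-idf weight, scaled to [0, 1] per term."""
    n = len(docs)
    counts = [Counter(doc) for doc in docs]
    doc_freq = Counter(term for c in counts for term in c)
    weights = {}
    for term, df in doc_freq.items():
        # Assumption: "+ 1" smoothing so terms found in every document keep a non-zero weight.
        idf = math.log(n / df) + 1.0
        raw = [c[term] / len(doc) * idf if doc else 0.0
               for c, doc in zip(counts, docs)]
        peak = max(raw)  # > 0 because the term occurs in at least one document
        weights[term] = [r / peak for r in raw]
    return weights


def nrd_like(x, y, weights, n_docs):
    """NRD-style distance: smaller means the two terms are more closely related."""
    wx, wy = weights.get(x), weights.get(y)
    if wx is None or wy is None:
        return math.inf  # unseen term: no evidence of relatedness
    fx, fy = sum(wx), sum(wy)
    fxy = sum(a * b for a, b in zip(wx, wy))  # joint relevance across documents
    if fxy == 0.0:
        return math.inf  # the terms never share a document
    denom = math.log(n_docs) - min(math.log(fx), math.log(fy))
    if denom == 0.0:
        return 0.0  # both terms are maximally relevant in every document
    return (max(math.log(fx), math.log(fy)) - math.log(fxy)) / denom


# Toy corpus (hypothetical). In the first document "doctor" and "nurse" are
# ten words apart, beyond the reach of a bigram or trigram window.
DOCS = [
    "the doctor examined the patient in the hospital ward before the nurse arrived".split(),
    "the nurse called the doctor".split(),
    "the stock market fell sharply".split(),
    "investors sold stock today".split(),
]
WEIGHTS = relevance_weights(DOCS)
related = nrd_like("doctor", "nurse", WEIGHTS, len(DOCS))
unrelated = nrd_like("doctor", "stock", WEIGHTS, len(DOCS))
```

Because the score is computed at the document level, `related` is small and finite even though the two words never fall inside the same short N-gram window, while `unrelated` is infinite because "doctor" and "stock" never share a document.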
Similar resources
Error correction of automatic speech recognition based on normalized web distance
In this paper, we focus on the problems associated with error correction of automatic speech recognition (ASR) based on confusion networks. The problems discussed are the availability of a corpus for calculating the semantic score and the performance degradation of N-gram error correction caused by null transitions in the confusion networks. In an attempt to solve these problems, first, ...
Performance Improvement of Dysarthric Speech Recognition Using Context-Dependent Pronunciation Variation Modeling Based on Kullback-Leibler Distance
In this paper, we propose context-dependent pronunciation variation modeling based on the Kullback-Leibler (KL) distance for improving the performance of dysarthric automatic speech recognition (ASR). To this end, we construct a triphone confusion matrix based on KL distances between triphone models, and build a weighted finite state transducer (WFST) from the triphone confusion matrix. Then, d...
Two-step correction of speech recognition errors based on n-gram and long contextual information
This paper presents fully automatic word-error correction on a confusion network that makes use of long contextual information. However, a problem with long contextual information is that the improvement in recognition accuracy is minimal because of word errors in the surrounding words. In this paper, recognition errors are first reduced by error correction using N-gram features. After that, the...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Dysarthric Speech Recognition Based on Error-Correction in a Weighted Finite State Transducer Framework
In this paper, a dysarthric speech recognition error-correction method in a weighted finite state transducer (WFST) framework is proposed to improve the performance of dysarthric automatic speech recognition (ASR). To this end, pronunciation variation models are constructed from a context-dependent confusion matrix based on a weighted Kullback-Leibler (KL) distance between triphones. Then, a WF...